P2P RVM for Distributed Classification
Abstract
In recent years there has been increasing interest in analytical methods that learn patterns over large-scale data distributed across Peer-to-Peer (P2P) networks and that support applications built on them. Mining patterns in such a distributed and dynamic environment is challenging because centralizing the data is not feasible. In this paper, we propose a distributed classification technique based on Relevance Vector Machines (RVM) and the exchange of local models among neighboring peers in a P2P network. In such networks, an efficient distributed classification algorithm is evaluated on two criteria: the size of the resulting local models (communication efficiency) and their prediction accuracy. RVM uses dramatically fewer kernel functions than a state-of-the-art ‘support vector machine’ (SVM) while demonstrating comparable generalization performance. This makes RVM a suitable choice for learning compact and accurate local models at each peer in a P2P network. Our model propagation approach exchanges the resulting models with peers in a local neighborhood to produce a more accurate network-wide global model while keeping the communication cost low throughout the network. Through extensive experimental evaluation, we demonstrate that, by using more relevant and compact models, our approach outperforms the baseline model propagation approaches in terms of accuracy and communication cost.
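The neighbor-to-neighbor model exchange described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how received models are combined, so plain probability averaging is assumed here. `SparseKernelModel` is a hypothetical stand-in for an already trained RVM (a few relevance vectors with weights; training is not shown), and `Peer`, `propagate`, the RBF kernel, and all numeric values are illustrative assumptions. Communication cost is counted as the number of kernel vectors transmitted, which is why a sparser local model is cheaper to share.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SparseKernelModel:
    """Hypothetical stand-in for a trained RVM: few vectors, one weight each."""
    vectors: list          # retained kernel centres (feature tuples)
    weights: list          # one weight per retained vector
    bias: float = 0.0
    gamma: float = 1.0     # RBF width (assumed kernel choice)

    def score(self, x):
        # Weighted sum of RBF kernels, squashed to a class-1 probability.
        s = self.bias
        for v, w in zip(self.vectors, self.weights):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
            s += w * math.exp(-self.gamma * sq_dist)
        return 1.0 / (1.0 + math.exp(-s))

    def size(self):
        # Model size drives communication cost when the model is sent.
        return len(self.vectors)


@dataclass
class Peer:
    name: str
    local_model: SparseKernelModel
    neighbors: list = field(default_factory=list)
    received: list = field(default_factory=list)

    def propagate(self):
        """Send the local model to each direct neighbor; return vectors sent."""
        cost = 0
        for neighbor in self.neighbors:
            neighbor.received.append(self.local_model)
            cost += self.local_model.size()
        return cost

    def predict(self, x):
        # Assumed combination rule: average the local and received models.
        models = [self.local_model] + self.received
        prob = sum(m.score(x) for m in models) / len(models)
        return 1 if prob >= 0.5 else 0


# Toy line topology a - b - c; class 0 lies near (0, 0), class 1 near (2, 2).
a = Peer("a", SparseKernelModel([(0.0, 0.0)], [-4.0], bias=1.0))
b = Peer("b", SparseKernelModel([(2.0, 2.0)], [4.0], bias=-1.0))
c = Peer("c", SparseKernelModel([(0.0, 0.0), (2.0, 2.0)], [-3.0, 3.0]))
a.neighbors = [b]
b.neighbors = [a, c]
c.neighbors = [b]

# One round of local model exchange across the whole network.
total_cost = sum(peer.propagate() for peer in (a, b, c))
print("vectors transmitted:", total_cost)
print("peer a predicts:", a.predict((2.0, 2.0)), a.predict((0.0, 0.0)))
```

In this sketch each peer only ever talks to its direct neighbors, so the cost of a round grows with the number of links and with the number of vectors each local model keeps, which mirrors the two evaluation criteria the abstract names.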
Similar resources
P2P Network Trust Management Survey
Peer-to-peer (P2P) applications are no longer limited to home users and are starting to be accepted in academic and corporate environments. While file sharing and instant messaging are the most traditional examples, they are no longer the only applications benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...
P2P Semantic Search
We consider P2P Semantic Search as a process of finding documents that are semantically related, i.e., related in meaning, to the user's information needs, in document collections distributed among a group of peers, i.e., autonomous information sources. To organize documents stored on a single peer efficiently for search, documents are classified to the user-generated classificat...
Online Network Traffic Classification Algorithm Based on RVM
Compared with the Support Vector Machine (SVM), the Relevance Vector Machine (RVM) not only avoids the overfitting that is characteristic of the SVM, but also greatly reduces the amount of kernel function computation and avoids the SVM's defects of weak sparsity, heavy computation, and a kernel function that must satisf...
A comparison of SVM and RVM for Document Classification
Document classification is the task of assigning a new, unclassified document to one of a predefined set of classes. Content-based document classification uses the content of the document, with some weighting criteria, to assign it to one of the predefined classes. It is a major task in library science, electronic document management systems and information sciences. This paper investigates do...
Satrap: Data and Network Heterogeneity Aware P2P Data-Mining
Distributed classification aims to build an accurate classifier by learning from distributed data while reducing computation and communication cost. A P2P network, where numerous users come together to share resources such as data content, bandwidth, storage space and CPU resources, is an excellent platform for distributed classification. However, two important aspects of the learning environment ha...